UNIVERSITY of WASHINGTON 
HUMAN SUBJECTS DIVISION 





GUIDANCE: Genomic Data Sharing 


1 PURPOSE 


To provide guidance to IRB members and HSD staff about the review of (1) research involving plans for 
sharing genomic data with NIH-designated repositories and (2) requests for certification of the data. 


2 RELEVANCE 


Researchers who plan to submit genomic data to NIH-designated repositories must obtain institutional 
certification that their data submission plans are consistent with NIH policies. 


The UW IRB is responsible for reviewing researchers’ genomic data sharing plans and consent forms to 
verify that NIH certification requirements have been met. See also SOP Request for Genomic Data 
Sharing Certification — Investigators and SOP Genomic Data Sharing Certification — HSD Procedures. 


3 DEFINITIONS 


This section provides definitions for key Genomic Data Sharing concepts, as described in NIH Policies. 


Coded: Any identifying information (such as name) that would enable the investigator to readily 
ascertain the identity of the individual to whom the private information or specimens pertain has been 
replaced with a number, letter, symbol, or combination thereof (i.e., the code) and a key to decipher the 
code exists, enabling linkage of the identifying information to the private information or specimens. 


Controlled-access: Data are available to an investigator for a specific project only if certain stipulations 
are met. 


dbGaP (database of Genotypes and Phenotypes): A central data repository at the National Center for 
Biotechnology Information (NCBI), a branch of the National Library of Medicine. 


De-identified Data: Data that have been de-identified according to the following criteria (note that 
this definition is specific to NIH’s Genomic Data Sharing policy): the identities of data subjects cannot 
be readily ascertained or otherwise associated with the data by the repository staff or secondary data 
users (45 CFR 46.102(f)); the 18 identifiers enumerated at 45 CFR 164.514(b)(2) (the HIPAA Privacy Rule) 
are removed; and the submitting institution has no actual knowledge that the remaining information 
could be used alone or in combination with other information to identify the subject of the data. 


Large-scale genomic data: The GDS Policy applies to all NIH-funded research that generates large-scale 
human or non-human genomic data, as well as to the use of these data for subsequent research. 
Large-scale data include genome-wide association studies (GWAS), single nucleotide polymorphism (SNP) 
arrays, and genome sequence, transcriptomic, metagenomic, epigenomic, and gene expression data. 
Examples are included below. See Supplemental Information to the NIH Genomic Data Sharing Policy for 
more examples. 


• Sequence data from more than one gene or region of comparable size in the genomes of 
more than 1,000 human research participants. 


Version 1.2 #1910 
Implemented 02/05/2018 Page 1 of 9 



















• Sequence data from more than 100 genes or regions of comparable size in the genomes of 
more than 100 human research participants. 


• Sequence data from more than 100 isolates from infectious organisms. 
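
The three examples listed for large-scale genomic data can be read as simple numeric thresholds. The following is a minimal sketch in Python, assuming a hypothetical SequenceDataset summary whose class and field names are illustrative and are not drawn from NIH or HSD materials. It covers only the three examples given in this definition. A dataset that matches none of them may still fall under the GDS Policy, so the Supplemental Information to the NIH Genomic Data Sharing Policy remains the reference.

```python
from dataclasses import dataclass


@dataclass
class SequenceDataset:
    """Hypothetical summary of a sequence dataset (field names are assumptions)."""
    genes_or_regions: int = 0      # genes, or regions of comparable size
    human_participants: int = 0    # human research participants sequenced
    infectious_isolates: int = 0   # isolates from infectious organisms


def matches_listed_example(ds: SequenceDataset) -> bool:
    """Return True if the dataset matches one of the three listed examples."""
    # More than one gene/region in more than 1,000 human participants.
    many_participants = ds.genes_or_regions > 1 and ds.human_participants > 1000
    # More than 100 genes/regions in more than 100 human participants.
    many_genes = ds.genes_or_regions > 100 and ds.human_participants > 100
    # More than 100 isolates from infectious organisms.
    many_isolates = ds.infectious_isolates > 100
    return many_participants or many_genes or many_isolates
```

A False result here means only that none of the three listed examples applies; it is not a determination that the GDS Policy does not apply.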


NIH GWAS Data Repository: Also known as the “database of Genotypes and Phenotypes (dbGaP)”, the 
NIH GWAS Data Repository is a database developed by the National Center for Biotechnology 
Information (a division of the National Library of Medicine) to archive and distribute the results of 
studies that have investigated the interaction of genotype and phenotype. 


NIH-designated repository: Any data repository maintained or supported by NIH either directly or 
through collaboration. 


Unrestricted-access: Data are accessible to anyone via public website (previously referred to as “open 
access”). 


UW IO: A Senior Official at the institution who is credentialed through the NIH eRA Commons system and 
is authorized to enter the institution into a legally binding contract and sign on behalf of an 
investigator who has submitted data or a data access request to NIH. The UW Institutional Official who 
has the authority to provide institutional certification for data sharing under the GWAS and GDS 
Policies is the Grant and Contract Administrator processing the award. 


4 OVERALL CONSIDERATIONS FOR INSTITUTIONAL CERTIFICATION 


4.1 The Institutional Certification should state whether the data will be submitted to an 
unrestricted-access or controlled-access database. 


4.2 The Institutional Certification should assure that: 


4.2.1 The data submission is consistent, as appropriate, with applicable national, tribal, and 
state laws and regulations as well as relevant institutional policies; 


4.2.2 Any limitations on research use of the data, as expressed in the informed consent 
documents, are delineated; 


4.2.3 The identities of research participants will not be disclosed to NIH-designated 
repositories; and 


4.2.4 An IRB, privacy board, and/or equivalent body, as applicable, has reviewed the 
investigator’s proposal for data submission and assures that: 


4.2.4.1 The protocol for the collection of genomic and phenotypic data is consistent 
with 45 CFR 46; 


4.2.4.2 Data submission and subsequent data sharing for research purposes are 
consistent with the informed consent of the study participants from whom 
the data were obtained; 


4.2.4.3 Consideration was given to risks to individual participants and their families 
associated with data submitted to NIH-designated data repositories and 
subsequent sharing; 


4.2.4.4 To the extent possible, consideration was given to risks to groups or 
populations associated with submitting data to NIH-designated repositories 
and subsequent sharing; and 


4.2.4.5 The investigator’s plan for de-identifying datasets is consistent with standards 
outlined in the NIH GDS Policy. 


4.3 These criteria for Institutional Certification are explained below. 


5 CONSISTENCY WITH APPLICABLE LAWS & POLICIES 


5.1 HSD staff identify applicable laws and policies as they would for the review of any application. 


5.2 Applicable policies include HSD SOPs and may include UW Privacy Policies. 


5.3 Applicable laws frequently include HHS human subjects protection regulations (45 CFR 46), FDA 
human subjects protection regulations (21 CFR Parts 50 and 56), the Health Insurance Portability 
and Accountability Act Privacy Rule (45 CFR Part 160 and Part 164, Subparts A and E), and WA 
State RCW 70.02: Health Care Information Access & Disclosure. 


5.4 Data submission must also be consistent with applicable tribal laws when the data are from 
American Indian and Alaska Native peoples. For example, tribal nations have jurisdiction over 
research conducted on tribal lands with tribal citizens. In general, the IRB relies on the researcher 
to provide relevant information about tribal laws as requested on the SUPPLEMENT: Genomic 
Data Sharing. 


6 DATA COLLECTION IS CONSISTENT WITH 45 CFR 46 


6.1 Data collection procedures must be consistent with HHS human subjects protection regulations. 


6.1.1 When the research involves the prospective collection of data or specimens, this is 
accomplished either by the UW IRB review process or by relying on an external IRB for 
review by means of an IRB Authorization Agreement. See SOP Authorization 
Agreements. 


6.1.2 When the research involves retrospective data, this is accomplished by confirming that 
data were collected with IRB approval. 


7 CONSIDERATION OF RISKS 


7.1 Risk assessment 


7.1.1 The IRB considers the risks associated with the genomic information in the event of 
re-identification and disclosure, both to minimize those risks and to weigh them against the 
expected benefits of broad sharing. 


7.1.2 The IRB also considers the extent to which genomic information associated with the 
participant could be used to identify an individual, or his or her family, by matching 
data sets to other sources of information. 


7.1.3 UW considers the sharing of genomic data through NIH-designated repositories to 
involve minimal risk provided the criteria below are met. It is important to note that 
sharing of genomic information through NIH repositories that does not meet these 
criteria is not inherently more than minimal risk. 

7.1.3.1 The expectations of the NIH GDS Policy or GWAS Policy are met; 

7.1.3.2 There is not a high risk of re-identification; and 


7.1.3.3 Results from secondary research using NIH data will not be returned to 
subjects. 


7.2 Risks of re-identification 


7.2.1 Currently, NIH-designated repositories that share genomic data do not meet the 
definition of human subjects research under HHS regulations at 45 CFR 46 because the 
data submitted to the repositories are collected solely for other research studies, and 
because the data are coded and the identity of the individuals from whom the data 
were obtained will not be readily ascertainable to the investigators maintaining the 
repository. 


7.2.2 NIH notes that this review and certification process goes beyond the requirements of 
45 CFR 46. However, NIH has implemented these policy requirements due to concerns 
that the evolution of genomic technology and analytical methods could increase the 
risk of re-identification and, consequently, the risks associated with inadvertent or 
inappropriate use or disclosure. 


7.2.3 Technologies available within the public domain today, and expected technological 
advances, make the identification of specific individuals from their genomic 
information increasingly straightforward. 


7.2.4 The number of DNA markers, such as single nucleotide polymorphisms (SNPs), that are 
needed to uniquely identify an individual is small. Such data can be used with a high degree 
of certainty to confirm that two samples come from the same person. Nevertheless, the ease of 
identifying people from genomic data should not be overstated. This cannot be done 
without reference data and a high degree of expertise. 


7.2.5 Examples of populations that may be at a higher risk of re-identification include: 

7.2.5.1 geographically defined communities; 

7.2.5.2 members of ultra-rare disease groups; 

7.2.5.3 individuals who have engaged in illegal behavior (see 7.4 below). 











7.3 Risks associated with FOIA 

7.3.1 Data held in NIH-designated repositories are U.S. government records that are subject to the 
Freedom of Information Act (FOIA). NIH is required to release government records unless the 
records are exempt from release under one of the FOIA exemptions. 


7.3.2 NIH believes the release of certain information to be an unreasonable invasion of 
privacy under FOIA exemption 6, 5 U.S.C. § 552(b)(6). Therefore, NIH foresees 
preserving the privacy of research participants and the confidentiality of genetic 
information by, for example, redacting individual-level genotype and phenotype data 
from any disclosures made in response to FOIA requests and denying requests for 
unredacted data. 


7.4 Risks associated with law enforcement 


7.4.1 Although NIH repositories hold only coded data, it is conceivable that law enforcement 
agencies could ask for genomic information from the repositories and, for example, 
search for matches to DNA for forensic purposes. Law enforcement might seek to 
compel disclosure of identifying information from the institution holding the identifying 
information. 


7.4.2 Release of identifiable information may be protected from compelled disclosure if a 
Certificate of Confidentiality is or was obtained for the original study. See SOP 
Certificate of Confidentiality. 


7.5 Potential harms to individuals, family members, specific populations, groups, and communities 


7.5.1 Harms that result from inappropriate use or disclosure of genomic data may include 
denial of employment or insurance. 


7.5.2 The Genetic Information Nondiscrimination Act of 2008 (GINA) provides a 
baseline level of protection against genetic discrimination in the United States. 


7.5.2.1 GINA is a federal law that prohibits discrimination in health coverage and 
employment based on genetic information. 


7.5.2.2 GINA does not protect against discrimination in the context of life insurance, 
disability insurance, or long-term care insurance. GINA’s protections apply to 
“asymptomatic” individuals, not those who have manifested disease. 


7.5.3 Harms may also include psychosocial harms such as stress, anxiety, stigmatization, or 
embarrassment resulting from disclosure of information about family relationships, 
ethnic heritage, or potentially stigmatizing conditions. 


7.5.4 Research has shown that some populations demonstrate a higher predisposition to 
developing certain diseases or disorders than others. Genetic variants associated with 
physical disorders, diseases, and behavioral traits, as well as causative variants, will be 
found in all populations at differing frequencies. Higher or lower frequencies that 
contribute to observed health patterns, particularly those that can be viewed 
negatively, can lead to genetic stereotypes and stigmatization of a population group. 


7.6 Return of individual research results 


7.6.1 Return of individual research results to participants from research using data shared 
through NIH repositories is expected to be an extremely rare occurrence. Nonetheless, 
the return of results must be carefully considered because the information can have a 
psychological impact (e.g., stress and anxiety) as well as implications for the participant’s 
health and well-being. While clinically valid and meaningful results can have a positive 
impact on an individual’s health, harms can occur if unvalidated research results are 
provided back to participants or used for medical decision-making. 


7.6.2 Secondary investigators will not be able to return results directly to participants 
because they will not have access to the identities of these individuals. If a secondary 
investigator does generate clinically valid results of immediate clinical significance, he 
or she can only facilitate their return by contacting the contributing investigator who 
holds the key (if still maintained) to the code that identifies participants. 


7.6.3 When links to identifying information are retained, individual participants may be given 
the option of choosing or declining to receive results. If participants are given the 
option of receiving results, researchers should be aware that results may be returned 
years after they have submitted the study data to NIH. 


8 DE-IDENTIFICATION OF DATA IS CONSISTENT WITH GDS POLICY 


8.1 De-identification Requirements 

8.1.1 Data submitted to NIH-designated repositories must be de-identified and coded using a 
random, unique code. 

8.1.2 The 18 identifiers enumerated in the HIPAA Privacy Rule and listed in the SUPPLEMENT: 
Genomic Data Sharing must be removed. 

8.1.3 Data should be de-identified such that the identities of the individuals from whom the 
data were collected cannot be readily ascertained or otherwise associated with the 
data by the NIH repository staff or secondary data users. 
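
As an illustration of 8.1.1 through 8.1.3, the following is a minimal sketch in Python of coding a set of records with random, unique codes. The field names in IDENTIFIER_FIELDS, the subject_code field, and the code_records function are hypothetical and stand in for only a few of the 18 HIPAA identifiers; the sketch is not a complete de-identification procedure and does not address the "actual knowledge" criterion in the definition of De-identified Data.

```python
import secrets

# Hypothetical field names standing in for a few of the 18 HIPAA identifiers.
# A real submission must remove all 18 (see 45 CFR 164.514(b)(2)).
IDENTIFIER_FIELDS = {"name", "date_of_birth", "phone", "email", "medical_record_number"}


def code_records(records):
    """Return (coded_records, key).

    coded_records: copies of the records with identifier fields removed and a
        random code added; these are what would be submitted.
    key: maps each code to the removed identifiers; it stays with the
        submitting institution and is not sent to the repository.
    """
    coded_records, key = [], {}
    for record in records:
        code = secrets.token_hex(8)   # random code, unrelated to any identifier
        while code in key:            # regenerate on collision so codes stay unique
            code = secrets.token_hex(8)
        key[code] = {f: v for f, v in record.items() if f in IDENTIFIER_FIELDS}
        coded = {f: v for f, v in record.items() if f not in IDENTIFIER_FIELDS}
        coded["subject_code"] = code
        coded_records.append(coded)
    return coded_records, key
```

Keeping the key separate from the coded records reflects the definition of Coded data in section 3: the link to identifying information exists, but the repository and secondary users do not hold it.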


9 INFORMED CONSENT 


9.1 Consent Requirements and Expectations for Genomic Data Sharing 
9.1.1 Use the WORKSHEET: Consent Requirements and Expectations for Genomic Data 
Sharing to identify the applicable consent requirements for genomic data sharing and 
to determine whether the requirements are met. 


9.1.1.1 If the consent requirements cannot be met, the same worksheet can be used 
to determine if the conditions are met for an exception to the consent 
requirements. Note that exceptions must also be approved by an Assistant 
Director of Operations (ADO) or the HSD Director. 


9.1.1.2 If neither the consent requirements nor the conditions for an exception can be 
met, the data cannot be certified unless subjects are re-consented for 
genomic data sharing. 





9.2 Studies involving minors 

9.2.1 If the study involves children, the IRB must consider the appropriateness of the 
continued maintenance and sharing of the data when the child reaches the legal age of 
consent. 


9.2.2 In particular, it is important to consider whether consent should be obtained from the 
now-adult subject. When a link to identifiers is maintained, researchers must provide 
the subject with the opportunity to withdraw data from the NIH repositories, unless 
the IRB approves a waiver of the consent requirement for the now-adult subjects. See 
SOP Consent for information about the waivers. 


9.3 Studies involving consent by a legally authorized representative (LAR) 


9.3.1 If the study proposes to obtain consent from legally authorized representatives, the IRB 
must consider the issues related to LAR consent as described in SOP Legally Authorized 
Representative. 


9.3.2 In particular, it is important to consider reconsent of subjects who regain the capacity 
to consent for themselves. When a link to identifiers is maintained, researchers must 
obtain consent from the subjects who regain the capacity to consent and provide the 
subject with the opportunity to withdraw data from the NIH repositories, unless the IRB 
approves a waiver of the consent requirement. 





10 DATA USE LIMITATIONS 


10.1 Consistency with Informed Consent 


10.1.1 Through the Controlled Access process for providing data access to secondary users, 
mechanisms are in place to minimize the likelihood of usage of genomic data in ways 
that are inconsistent with the original informed consent. The IRB is expected to have 
reviewed all proposed submissions of data to NIH-designated repositories to ensure 
that the submission and subsequent sharing for research purposes are consistent with 
the informed consent of the study participants, certify the appropriate research uses of 
the data, and identify the specific data use limitations. 


10.1.2 The IRB accomplishes this by reviewing the terms of the consent form and 
documenting any limitations to use of the data, as expressed in the consent form, in 
the Data Use Limitations table in the GDS section of the WORKSHEET: Genomic Data 
Sharing Certification (which is ultimately included in the Institutional Certification). 


10.1.3 For example, if the consent form includes the possibility of data sharing but states that 
the data will only be used for the study or a particular disease, a disease-specific data 
use limitation should be documented in the worksheet unless subjects are 
re-consented for broader use of the data. 


10.2 Four main categories of limitations 


10.2.1 In the document Points to Consider in Developing Effective Data Use Limitation 
Statements, NIH provides standard categories of data use limitations. The four main 
categories are: 


10.2.1.1 General Research Use 


10.2.1.1.1 Data can be used for any research purpose but would not be 
made available for non-research purposes. These data would 
generally be made available to any qualified investigator. 


10.2.1.2 Health/Medical/Biomedical 


10.2.1.2.1 Use of these data is limited to a focus on health/biomedical 
research objectives, excluding the study of population origins or 
ancestry. These data would generally be made available to any 
qualified investigator. 


10.2.1.3 Disease-specific 


10.2.1.3.1 Data can only be used for research on a specific disease or a 
related condition. When informed consent documents allow the 
data to be used for future studies related only to a particular 
disease (e.g., diabetes and related conditions), a disease-specific 
limitation would be appropriate. 


10.2.1.4 Other 


10.2.1.4.1 These are data use limitations that are not included in the 
standard NIH categories and that are specified by the certifying 
institution. 


10.3 Modifiers to the main categories 

10.3.1 The following limitations are modifiers of the four main categories: 


10.3.1.1 Genetic studies only 


10.3.1.1.1 Data can be used only for genetic studies. These may include 
research on the role of genetics in any disease, condition, or 
non-disease trait. These may also include research that could have 
implications for understanding ancestral history because of the 
information that it may provide about allele frequencies in 
different populations. 


10.3.1.2 Methods 


10.3.1.2.1 Data can be used for statistical methods research and 
development (e.g., development of statistical software or 
algorithms). 


10.3.1.3 Not-for-profit use only 


10.3.1.3.1 Data can be used only by not-for-profit organizations. If the data 
should not be made available to commercial entities, this 
restriction should be stated specifically as a data use limitation. 


10.3.1.4 Publication required 


10.3.1.4.1 Data can be used only if the secondary investigator will 
disseminate the study findings to the larger scientific community. 


10.3.1.5 IRB approval required 


10.3.1.5.1 Data can be used only with IRB approval from the secondary 
investigator’s institution. Documentation of local IRB approval, 
including a description of the type of review (e.g., full committee 
or expedited), would be submitted as part of the data access 
request. 
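
The relationship between the four main categories in 10.2 and the modifiers in 10.3 can be pictured as one main category plus zero or more modifiers. The following is a minimal sketch in Python under that assumption. The class names, the describe method, and the output format are hypothetical and are not the format of the WORKSHEET: Genomic Data Sharing Certification or of any NIH system.

```python
from dataclasses import dataclass, field
from enum import Enum


class MainCategory(Enum):
    GENERAL_RESEARCH_USE = "General Research Use"
    HEALTH_MEDICAL_BIOMEDICAL = "Health/Medical/Biomedical"
    DISEASE_SPECIFIC = "Disease-specific"
    OTHER = "Other"


class Modifier(Enum):
    GENETIC_STUDIES_ONLY = "Genetic studies only"
    METHODS = "Methods"
    NOT_FOR_PROFIT_USE_ONLY = "Not-for-profit use only"
    PUBLICATION_REQUIRED = "Publication required"
    IRB_APPROVAL_REQUIRED = "IRB approval required"


@dataclass
class DataUseLimitation:
    """Hypothetical record: one main category plus any number of modifiers."""
    category: MainCategory
    disease: str = ""                         # used only for DISEASE_SPECIFIC
    modifiers: list = field(default_factory=list)

    def describe(self) -> str:
        """Build a one-line summary (illustrative format only)."""
        label = self.category.value
        if self.category is MainCategory.DISEASE_SPECIFIC:
            if not self.disease:
                raise ValueError("a disease-specific limitation must name the disease")
            label += f" ({self.disease})"
        for modifier in self.modifiers:
            label += f"; {modifier.value}"
        return label
```

For instance, a consent form that permits future research only on diabetes and related conditions, and that excludes commercial entities, would correspond to the Disease-specific category with the Not-for-profit use only modifier.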